chore: back-merge main into develop after 2.0.0 release
Move all packages/*/dist/ directories from committed artifacts to gitignored build output. dist/ is regenerated locally with `pnpm build` before running the CLI, MCP smoke, the library-API demo, or `pnpm validate`.

- .gitignore: ignore ragmir-core/dist, ragmir-tts/dist (already ignored for app/landing/license-webhook); add *dist catch-all.
- ci.yml: drop the `git diff --exit-code -- dist` step that enforced committed dist, since dist is no longer tracked.
- AGENTS.md, CLAUDE.md, README.md, library-api-demo README: document that dist is gitignored and must be built locally; warn against `npx ragmir` for local testing (resolves the published npm package, not the working copy).
Replace the weighted-sum fusion (vector and BM25 scores divided by their max) with Reciprocal Rank Fusion, the standard hybrid-retrieval approach. Each candidate scores `weight / (RRF_K + rank)` per retriever it appears in, summed across retrievers, so the BM25 and vector score distributions never need calibration against each other.

The vector retriever is weighted higher (0.7) than the lexical one (0.3) because, with the default local-hash embeddings, vector proximity is the more discriminant signal on small corpora; the lexical weight still lets exact-keyword evidence pull in candidates the vector retriever missed.

- RRF_K = 60 (Cormack et al. 2009 constant).
- Remove the now-unused weighted-sum helpers (vectorScore, normalizeScore) and the normalizeForMatch import left dead by the refactor.

Retrieval recall stays at 1.0 on the sovereign-rag-demo golden set.
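The fusion rule described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the names `RankedList` and `fuseRrf` are hypothetical, ranks are assumed to be 1-based, and each retriever is assumed to return every id at most once.

```typescript
// Weighted Reciprocal Rank Fusion: each candidate earns weight / (k + rank)
// from every retriever that returned it, and the contributions are summed.
const RRF_K = 60; // constant from Cormack et al. 2009, as stated in the commit

// Hypothetical shape: one ranked id list per retriever, best match first.
interface RankedList {
  weight: number;
  ids: string[];
}

function fuseRrf(
  lists: RankedList[],
  k: number = RRF_K,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id for a deterministic order.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Usage: "a" appears in both lists, so it outranks everything else, while "d"
// (found only by the lexical retriever) still makes it into the result.
const fused = fuseRrf([
  { weight: 0.7, ids: ["a", "b", "c"] }, // vector retriever
  { weight: 0.3, ids: ["d", "a"] }, // BM25 retriever
]);
// ids in fused order: a, b, c, d
```

Because only ranks enter the formula, raw BM25 scores and vector distances never have to be brought onto a common scale.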
Above a 256-row threshold, automatically create an IVF_PQ index on the vector column after writing the table. Below the threshold, LanceDB keeps using an exact flat scan, which is optimal for small corpora and avoids wasted index-training work.

- numPartitions ≈ sqrt(rowCount), clamped to [8, 1024] (LanceDB production heuristic).
- numSubVectors = 16 (divides the 384-dim local-hash/mxbai-xsmall vectors).
- index creation is idempotent (skipped if vector_idx exists) and best-effort (a training failure on edge-case dimensionality leaves the table usable via flat scan rather than failing the ingest).

This unblocks query scalability beyond brute-force scan without changing the overwrite write path.
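The parameter selection above can be sketched as a pure function. The function name `planIvfPqIndex`, the `IvfPqPlan` type, the strict "greater than 256" boundary, the rounding of the square root, and the divisibility guard are all assumptions made for illustration; the sketch deliberately stops short of any LanceDB call.

```typescript
const INDEX_ROW_THRESHOLD = 256; // flat scan at or below this (boundary assumed)
const MIN_PARTITIONS = 8;
const MAX_PARTITIONS = 1024;
const NUM_SUB_VECTORS = 16;

// Hypothetical result type: the two IVF_PQ parameters named in the commit.
interface IvfPqPlan {
  numPartitions: number;
  numSubVectors: number;
}

// Returns the index parameters, or null when the table should keep using
// the exact flat scan.
function planIvfPqIndex(rowCount: number, dimension: number): IvfPqPlan | null {
  if (rowCount <= INDEX_ROW_THRESHOLD) return null;
  // Assumed guard: product quantization splits each vector into equal
  // sub-vectors, so skip indexing when the dimension is not divisible by 16.
  if (dimension % NUM_SUB_VECTORS !== 0) return null;
  const approx = Math.round(Math.sqrt(rowCount));
  const numPartitions = Math.min(MAX_PARTITIONS, Math.max(MIN_PARTITIONS, approx));
  return { numPartitions, numSubVectors: NUM_SUB_VECTORS };
}

// Usage: a 10 000-row table of 384-dim vectors gets 100 partitions.
const plan = planIvfPqIndex(10_000, 384);
```

With a 256-row threshold the square root is already above 16, so the lower clamp of 8 only matters if the threshold is ever lowered.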
Close two confidentiality gaps and broaden provider coverage in the built-in redaction patterns:

- credit_card: add a match-then-verify Luhn check (new RedactionPattern.verify field). Numeric runs that are not valid card numbers (version numbers, account IDs, hex runs) are left untouched instead of being over-redacted.
- url_credentials: extend the pattern so both the username and the password are redacted. Previously only the password was stripped, leaking the username.
- Add Stripe secret keys (sk_live/rk_live/sk_test), GitLab tokens (glpat-), and generic Bearer tokens. Order the more specific patterns before the generic api_token so they win on overlap.
- Add an optional `verify: "luhn"` to the RedactionPattern type so custom patterns can opt into the same check.
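A match-then-verify step of the kind described for credit_card might look like the sketch below. The regular expression, the 13 to 19 digit length bound, the `[REDACTED:credit_card]` placeholder, and the function names are assumptions for illustration, not the patterns shipped in this commit.

```typescript
// Luhn checksum: walking from the rightmost digit, double every second digit,
// subtract 9 from results above 9, and require the total to be divisible by 10.
function passesLuhn(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false; // assumed card-length bounds
  let sum = 0;
  let double = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" has char code 48
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

// Match first, then verify: only digit runs that pass the checksum are replaced.
function redactCards(text: string): string {
  const candidate = /\b\d(?:[ -]?\d){12,18}\b/g; // hypothetical pattern
  return text.replace(candidate, (match) =>
    passesLuhn(match) ? "[REDACTED:credit_card]" : match,
  );
}

// Usage: the well-known test number is redacted, the arbitrary run is kept.
const redacted = redactCards("card 4111-1111-1111-1111 build 1234567890123456");
```

The checksum only filters out numeric runs that cannot be card numbers; roughly one random run in ten still passes, so it reduces over-redaction rather than eliminating it.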
…d use
Several additive robustness and observability improvements, plus extraction of the CLI option parsers into a testable module:

- config: make rawConfigSchema strict so unknown keys (typos) are rejected instead of silently ignored; warn on stderr when an env override (e.g. RAGMIR_TOP_K=abc) is invalid so operators notice a no-op override.
- access-log: bound the log growth with a soft cap. When the file exceeds 10 MB, trim it to the most recent 50 000 lines before the next append, so a long-lived MCP server cannot grow it without limit or OOM a usage report.
- embeddings: bound the Transformers.js pipeline cache to 3 entries with LRU eviction, and export clearTransformersCache(). destroyIndex now calls it so a re-ingest with a different embedding config does not pin stale ONNX weights.
- cli-options: extract the pure option parsers (parsePositiveInt, parseNumber, parseRecallThreshold, audioEngine, audioAllowRemoteModels, audioLanguage, parseAgentInstallScope, parseAgentInstallMode) into a dedicated module so they can be unit-tested without importing commander. cli.ts imports them. parsePositiveInt now rejects fractional input like "1.5" instead of silently truncating via parseInt.
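Two of the behaviours listed above lend themselves to a short sketch: a strict positive-integer parser and a small LRU-bounded cache. Both are illustrative only. The signature and error text of `parsePositiveInt` here are assumptions, and `LruCache` is a hypothetical stand-in for whatever structure holds the pipelines in the actual module.

```typescript
// Strict parser: accept only plain decimal digits, so "1.5", "abc", and "-3"
// are rejected instead of being truncated the way parseInt would.
function parsePositiveInt(value: string, label: string = "value"): number {
  const trimmed = value.trim();
  const parsed = Number(trimmed);
  if (!/^\d+$/.test(trimmed) || !Number.isSafeInteger(parsed) || parsed <= 0) {
    throw new Error(`${label} must be a positive integer, got "${value}"`);
  }
  return parsed;
}

// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage: with a cap of 3, touching "a" before adding "d" evicts "b" instead.
const cache = new LruCache<string, string>(3);
cache.set("a", "model-a");
cache.set("b", "model-b");
cache.set("c", "model-c");
cache.get("a");
cache.set("d", "model-d");
```

Keeping the parser free of any CLI framework import is what makes it testable in isolation, which is the motivation the commit gives for the extraction.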
Close the test-coverage gaps the audit identified, raising the suite from 132 to 151 cases across 23 files:

- destroy.test.ts (new): destroyIndex removed flag and access-log entry.
- query.test.ts: ask() empty-sources and populated cited-retrieval branches.
- store.test.ts: empty-text-files manifest round-trip, removal on empty, missing, malformed, and malformed-entry filtering; writeRows zero-rows dropTable and full re-write.
- embeddings.test.ts: embedTexts([]) early return and clearTransformersCache.
- ingest.test.ts: --rebuild forces a full re-index (reusedFiles === 0).
- config.test.ts: strict() rejects unknown keys; non-object config rejected.
- access-log.test.ts: retention trims past 10 MB; disabled logging writes nothing.
- evaluate.test.ts: miss case (hit=false, bestRank=null, recall=0).
- redaction.test.ts: Luhn pass/fail, URL username redacted, Stripe/GitLab/bearer providers, obfuscation limitation documented.
- cli.test.ts (new): all cli-options parsers incl. the MP3-without-engine confidentiality guard and agent scope/mode validation.
- text.test.ts (new): tokenize/normalizeForMatch (the BM25 foundation).
…y-overhaul feat(core): RAG retrieval overhaul (RRF + IVF_PQ), redaction hardening, and test coverage
🎉 This PR is included in version 2.1.0 🎉 The release is available on:
Your semantic-release bot 📦🚀
Release PR — develop → main
Promotes the RAG retrieval, security, and coverage overhaul to production. Merging this PR triggers the protected `Release npm` workflow, which runs semantic-release to derive the version from Conventional Commits and publishes `@jcode.labs/ragmir-tts` then `@jcode.labs/ragmir` to npm.

Expected version bump
MINOR (e.g. 2.0.0 → 2.1.0), driven by the `feat` commits (RRF fusion, IVF_PQ index, config hardening). No breaking public-API changes.

What's in this release
`dist/` is now gitignored build output.

Pre-merge verification
`pnpm validate` equivalent run locally (lint + audit + check + test 151/151 + build + smoke).
`zeroTelemetry=true`, `llmGeneration=false`, `redactionEnabled=true`.

After merge
The `Release npm` workflow publishes both packages. No local publish, no direct push to main.